-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[flytekit][2][untyped dict] Binary IDL With MessagePack #2757
Conversation
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #2757 +/- ##
===========================================
+ Coverage 66.44% 89.07% +22.62%
===========================================
Files 9 8 -1
Lines 453 357 -96
===========================================
+ Hits 301 318 +17
+ Misses 152 39 -113 ☔ View full report in Codecov by Sentry. |
…ith-message-pack-bytes-2
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
…ith-message-pack-bytes-2 Signed-off-by: Future-Outlier <[email protected]>
@Future-Outlier , why did you close this PR? |
# In Mashumaro, the default encoder uses strict_map_key=False, while the default decoder uses strict_map_key=True. | ||
# This is relevant for cases like Dict[int, str]. | ||
# If strict_map_key=False is not used, the decoder will raise an error when trying to decode keys that are not strictly typed.` | ||
def _default_flytekit_decoder(data: bytes) -> Any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could we call this _default_msgpack_decoder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we may have other decoder in the future, like _default_utf8_decoder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, pretty good advice, thank you!
self._msgpack_decoder[expected_python_type] = decoder | ||
return decoder.decode(binary_idl_object.value) | ||
else: | ||
raise TypeTransformerFailedError(f"Unsupported binary format {binary_idl_object.tag}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise TypeTransformerFailedError(f"Unsupported binary format {binary_idl_object.tag}") | |
raise TypeTransformerFailedError(f"Unsupported binary format `{binary_idl_object.tag}`") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you!
@@ -1697,17 +1714,15 @@ def extract_types_or_metadata(t: Optional[Type[dict]]) -> typing.Tuple: | |||
return None, None | |||
|
|||
@staticmethod | |||
def dict_to_generic_literal(ctx: FlyteContext, v: dict, allow_pickle: bool) -> Literal: | |||
def dict_to_binary_literal(ctx: FlyteContext, v: dict, allow_pickle: bool) -> Literal: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we still use generic to save the pickle file, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I didn't change the logic for pickle
@@ -1717,7 +1732,7 @@ def dict_to_generic_literal(ctx: FlyteContext, v: dict, allow_pickle: bool) -> L | |||
), | |||
metadata={"format": "pickle"}, | |||
) | |||
raise e | |||
raise TypeTransformerFailedError(f"Cannot convert {v} to Flyte Literal.\n" f"Error Message: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise TypeTransformerFailedError(f"Cannot convert {v} to Flyte Literal.\n" f"Error Message: {e}") | |
raise TypeTransformerFailedError(f"Cannot convert `{v}` to Flyte Literal.\n" f"Error Message: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no problem, thank you
(typing.Dict[str, int], {"a": 1, "b": 2, "c": 3}), | ||
(typing.Dict[str, typing.List[int]], {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}), | ||
(typing.Dict[str, typing.Dict[int, str]], {"a": {"1": "a", "2": "b", "3": "c"}, "b": {"4": "d", "5": "e", "6": "f"}}), | ||
(typing.Dict[str, typing.Dict[int, str]], {"a": {1: "a", 2: "b", 3: "c"}, "b": {4: "d", 5: "e", 6: "f"}}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we add one more example for Union[dict, str]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, no problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the great set of unit tests.
@@ -161,6 +163,7 @@ async def test_agent(mock_boto_call, mock_return_value): | |||
if "pickle_check" in mock_return_value[0][0]: | |||
assert "pickle_file" in outputs["result"] | |||
else: | |||
outputs["result"] = msgpack.loads(base64.b64decode(outputs["result"])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flytekit/flytekit/interaction/string_literals.py
Lines 44 to 45 in 9a9f5b2
if scalar.binary: | |
return base64.b64encode(scalar.binary.value) |
Because someone serialize the bytes
in the binary IDL here before.
@@ -159,7 +160,7 @@ async def test_openai_batch_agent(mock_retrieve, mock_create, mock_context): | |||
outputs = literal_map_string_repr(resource.outputs) | |||
result = outputs["result"] | |||
|
|||
assert result == batch_retrieve_result.to_dict() | |||
assert msgpack.loads(base64.b64decode(result)) == batch_retrieve_result.to_dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
def test_untyped_dict(): | ||
ctx = FlyteContextManager.current_context() | ||
|
||
dict_inputs = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have tests that include datetime
and other possibly more complex objects too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NO PROBLEM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enum and datetime is not supported in untyped_dict
, and I think this is ok, no one use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other complex objects like data class will not be supported too, since we don't have type hints
Tracking issue
flyteorg/flyte#5318
Why are the changes needed?
We want to support untyped dict with 100% type correct, turn it to
msgpack bytes
can ensure 100% type correct.What changes were proposed in this pull request?
dict_to_generic_literal
todict_to_binary_literal
self.from_binary_idl
intoto_python_val
test_untyped_dict
to test 8 cases of untyped dictHow was this patch tested?
unit tests, local execution and remote execution.
Setup process
Screenshots
There are 8 test cases.
Check all the applicable boxes
Related PRs
Docs link